Optimizing the Expected Mean Payoff in Energy Markov Decision Processes
Authors
Abstract
Energy Markov Decision Processes (EMDPs) are finite-state Markov decision processes where each transition is assigned an integer counter update and a rational payoff. An EMDP configuration is a pair s(n), where s is a control state and n is the current counter value. The configurations are changed by performing transitions in the standard way. We consider the problem of computing a safe strategy (i.e., a strategy that keeps the counter non-negative) which maximizes the expected mean payoff.
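The definitions above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not taken from the paper: the one-state toy model `EMDP`, the action names `work` and `charge`, the helper names, and the use of floats in place of rational payoffs. It shows a configuration s(n) as a (state, counter) pair, a strategy as a function of the configuration, and the empirical mean payoff of one safe strategy over a finite run. The one-step safety check is sufficient only because a counter-increasing action is always available in this toy model; in general, safety must hold along the entire infinite run.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    target: str    # next control state
    prob: float    # probability of this outcome
    update: int    # integer counter update
    payoff: float  # payoff (rational in the paper, float here for brevity)


# Hypothetical toy EMDP: state -> action -> list of probabilistic outcomes.
EMDP = {
    "s": {
        "work": [Transition("s", 1.0, -1, 5.0)],    # spends energy, earns payoff
        "charge": [Transition("s", 1.0, +2, 0.0)],  # gains energy, earns nothing
    },
}


def one_step_safe_actions(state, counter):
    """Actions whose every outcome keeps the counter non-negative for one step."""
    return [
        action
        for action, outcomes in EMDP[state].items()
        if all(counter + t.update >= 0 for t in outcomes)
    ]


def greedy_strategy(state, counter):
    """Prefer 'work' whenever it is one-step safe, otherwise 'charge'."""
    return "work" if "work" in one_step_safe_actions(state, counter) else "charge"


def simulate(strategy, state, counter, steps, rng):
    """Run the strategy from configuration state(counter); return (mean payoff, final counter)."""
    total = 0.0
    for _ in range(steps):
        outcomes = EMDP[state][strategy(state, counter)]
        t = rng.choices(outcomes, weights=[o.prob for o in outcomes])[0]
        counter += t.update
        if counter < 0:
            raise ValueError("strategy is not safe: counter dropped below zero")
        total += t.payoff
        state = t.target
    return total / steps, counter


mean_payoff, final_counter = simulate(greedy_strategy, "s", 0, 300, random.Random(0))
print(mean_payoff, final_counter)
```

In this toy model the greedy strategy repeats the cycle charge, work, work, so its mean payoff is 10/3 per step. The example only evaluates a given strategy by simulation; it does not implement the paper's method for computing an optimal safe strategy.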
Related resources
Games and Markov Decision Processes with Mean-Payoff Parity and Energy Parity Objectives
In this paper we survey results of two-player games on graphs and Markov decision processes with parity, mean-payoff and energy objectives, and the combination of mean-payoff and energy objectives with parity objectives. These problems have applications in verification and synthesis of reactive systems in resource-constrained environments.
Energy and Mean-Payoff Parity Markov Decision Processes
We consider Markov Decision Processes (MDPs) with mean-payoff parity and energy parity objectives. In system design, the parity objective is used to encode ω-regular specifications, and the mean-payoff and energy objectives can be used to model quantitative resource constraints. The energy condition requires that the resource level never drops below 0, and the mean-payoff condition requires tha...
Optimizing Expectation with Guarantees in POMDPs
Our contribution: optimizing the expected discounted-sum payoff in a given POMDP while guaranteeing a payoff of at least a given threshold. In safety-critical applications, the worst-case behaviour has priority, yet we would still like to maximize the expected payoff. The approach is inspired by the beyond worst-case approach from formal verification. Solution implemented as an extension of the partially-observab...
A Class of Markov Decision Processes with Pure and Stationary Optimal Strategies
We are interested in the existence of pure and stationary optimal strategies in Markov decision processes. We restrict to Markov decision processes with finitely many states and actions and infinite duration. In a Markov decision process, each state is labelled by an immediate payoff and each infinite history generates a stream of immediate payoffs. The final payoff associated with an infinite ...
Optimizing Expectation with Guarantees in POMDPs (Technical Report)
A standard objective in partially-observable Markov decision processes (POMDPs) is to find a policy that maximizes the expected discounted-sum payoff. However, such policies may still permit unlikely but highly undesirable outcomes, which is problematic especially in safety-critical applications. Recently, there has been a surge of interest in POMDPs where the goal is to maximize the probabilit...